AITopics | block world

block world

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

What Planning Problems Can A Relational Neural Network Solve?

Neural Information Processing Systems | Feb-16-2026, 19:27:12 GMT

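Goal-conditioned policies are generally understood to be "feed-forward" circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take.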

artificial intelligence, machine learning, regression rule, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)

Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations

Song, Zhao, Yue, Song, Zhang, Jiahao

arXiv.org Artificial Intelligence | Jul-24-2025

Large Reasoning Models (LRMs) have become a central focus in today's large language model (LLM) research, where models are designed to output a step-by-step thinking process before arriving at a final answer to handle complex reasoning tasks. Despite their promise, recent empirical studies (e.g., [Shojaee et al., 2025] from Apple) suggest that this thinking process may not actually enhance reasoning ability, where LLMs without explicit reasoning actually outperform LRMs on tasks with low or high complexity. In this work, we revisit these findings and investigate whether the limitations of LRMs persist when tool augmentations are introduced. We incorporate two types of tools, Python interpreters and scratchpads, and evaluate three representative LLMs and their LRM counterparts on Apple's benchmark reasoning puzzles. Our results show that, with proper tool use, LRMs consistently outperform their non-reasoning counterparts across all levels of task complexity. These findings challenge the recent narrative that reasoning is an illusion and highlight the potential of tool-augmented LRMs for solving complex problems.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.17699

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Lawsen, A.

arXiv.org Artificial Intelligence | Jun-18-2025

Shojaee et al. (2025) report that Large Reasoning Models (LRMs) exhibit "accuracy collapse" on planning puzzles beyond certain complexity thresholds. We demonstrate that their findings primarily reflect experimental design limitations rather than fundamental reasoning failures. Our analysis reveals three critical issues: (1) Tower of Hanoi experiments risk exceeding model output token limits, with models explicitly acknowledging these constraints in their outputs; (2) The authors' automated evaluation framework fails to distinguish between reasoning failures and practical constraints, leading to misclassification of model capabilities; (3) Most concerningly, their River Crossing benchmarks include mathematically impossible instances for N > 5 due to insufficient boat capacity, yet models are scored as failures for not solving these unsolvable problems. When we control for these experimental artifacts, by requesting generating functions instead of exhaustive move lists, preliminary experiments across multiple models indicate high accuracy on Tower of Hanoi instances previously reported as complete failures. These findings highlight the importance of careful experimental design when evaluating AI reasoning capabilities.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.0925

Country: Asia > Vietnam > Hanoi > Hanoi (0.49)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
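
The output-length concern raised in the abstract above can be made concrete with a small sketch. The optimal Tower of Hanoi solution for n disks has 2^n - 1 moves, so an exhaustive move list grows exponentially while a short recursive generator describes the same solution. The function names below (`hanoi_moves`, `move_count`, `is_valid_solution`) and the peg labels are illustrative assumptions, not code from the paper or from the original benchmark.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical move encoding for this sketch: (disk, from_peg, to_peg).
Move = Tuple[int, str, str]


def hanoi_moves(n: int, src: str = "A", dst: str = "C", aux: str = "B") -> Iterator[Move]:
    """Lazily yield the optimal move sequence for n disks from src to dst."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the auxiliary peg.
    yield from hanoi_moves(n - 1, src, aux, dst)
    # Move the largest disk directly to its destination.
    yield (n, src, dst)
    # Move the n-1 smaller disks from the auxiliary peg onto the largest disk.
    yield from hanoi_moves(n - 1, aux, dst, src)


def move_count(n: int) -> int:
    """Length of the optimal solution: 2^n - 1 moves."""
    return 2 ** n - 1


def is_valid_solution(n: int, moves: Iterable[Move]) -> bool:
    """Replay moves on pegs A, B, C and check legality and the final state."""
    pegs: Dict[str, List[int]] = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for disk, src, dst in moves:
        # The moved disk must be on top of its source peg.
        if not pegs[src] or pegs[src][-1] != disk:
            return False
        # A disk may never be placed on a smaller disk.
        if pegs[dst] and pegs[dst][-1] < disk:
            return False
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    for n in (3, 10, 15, 20):
        print(n, move_count(n))
```

Under these assumptions, `move_count(20)` already exceeds one million moves, which suggests why a full move listing may not fit within a fixed output budget even when the generating procedure itself is only a few lines long.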

Logic-of-Thought: Empowering Large Language Models with Logic Programs for Solving Puzzles in Natural Language

Li, Naiqi, Liu, Peiyuan, Liu, Zheng, Dai, Tao, Jiang, Yong, Xia, Shu-Tao

arXiv.org Artificial Intelligence | May-23-2025

Solving puzzles in natural language poses a long-standing challenge in AI. While large language models (LLMs) have recently shown impressive capabilities in a variety of tasks, they continue to struggle with complex puzzles that demand precise reasoning and exhaustive search. In this paper, we propose Logic-of-Thought (Logot), a novel framework that bridges LLMs with logic programming to address this problem. Our method leverages LLMs to translate puzzle rules and states into answer set programs (ASPs), the solutions of which are then accurately and efficiently inferred by an ASP interpreter. This hybrid approach combines the natural language understanding of LLMs with the precise reasoning capabilities of logic programs. We evaluate our method on various grid puzzles and dynamic puzzles involving actions, demonstrating near-perfect accuracy across all tasks. Our code and data are available at: https://github.com/naiqili/Logic-of-Thought.

large language model, machine learning, puzzle, (19 more...)

arXiv.org Artificial Intelligence

2505.16114

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

No Plan but Everything Under Control: Robustly Solving Sequential Tasks with Dynamically Composed Gradient Descent

Mengers, Vito, Brock, Oliver

arXiv.org Artificial Intelligence | Mar-3-2025

We introduce a novel gradient-based approach for solving sequential tasks by dynamically adjusting the underlying myopic potential field in response to feedback and the world's regularities. This adjustment implicitly considers subgoals encoded in these regularities, enabling the solution of long sequential tasks, as demonstrated by solving the traditional planning domain of Blocks World - without any planning. Unlike conventional planning methods, our feedback-driven approach adapts to uncertain and dynamic environments, as demonstrated by one hundred real-world trials involving drawer manipulation. These experiments highlight the robustness of our method compared to planning and show how interactive perception and error recovery naturally emerge from gradient descent without explicitly implementing them. This offers a computationally efficient alternative to planning for a variety of sequential tasks, while aligning with observations on biological problem-solving strategies.

interconnection, regularity, subgoal, (14 more...)

arXiv.org Artificial Intelligence

2503.01732

Country:

Europe > Germany > Berlin (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.73)

Repairs in a Block World: A New Benchmark for Handling User Corrections with Multi-Modal Language Models

Chiyah-Garcia, Javier, Suglia, Alessandro, Eshghi, Arash

arXiv.org Artificial Intelligence | Oct-4-2024

In dialogue, the addressee may initially misunderstand the speaker and respond erroneously, often prompting the speaker to correct the misunderstanding in the next turn with a Third Position Repair (TPR). The ability to process and respond appropriately to such repair sequences is thus crucial in conversational AI systems. In this paper, we first collect, analyse, and publicly release BlockWorld-Repairs: a dataset of multi-modal TPR sequences in an instruction-following manipulation task that is, by design, rife with referential ambiguity. We employ this dataset to evaluate several state-of-the-art Vision and Language Models (VLM) across multiple settings, focusing on their capability to process and accurately respond to TPRs and thus recover from miscommunication. We find that, compared to humans, all models significantly underperform in this task. We then show that VLMs can benefit from specialised losses targeting relevant tokens during fine-tuning, achieving better performance and generalising better to new scenarios. Our results suggest that these models are not yet ready to be deployed in multi-modal collaborative settings where repairs are common, and highlight the need to design training regimes and objectives that facilitate learning from interaction. Our code and data are available at www.github.com/JChiyah/blockworld-repairs

dialogue, instruction, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2409.14247

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Dominican Republic (0.04)
(24 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)

What Planning Problems Can A Relational Neural Network Solve?

Mao, Jiayuan, Lozano-Pérez, Tomás, Tenenbaum, Joshua B., Kaelbling, Leslie Pack

arXiv.org Machine Learning | Dec-6-2023

Goal-conditioned policies are generally understood to be "feed-forward" circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take. However, under what circumstances such a policy can be learned and how efficient the policy will be are not well understood. In this paper, we present a circuit complexity analysis for relational neural networks (such as graph neural networks and transformers) representing policies for planning problems, by drawing connections with serialized goal regression search (S-GRS). We show that there are three general classes of planning problems, in terms of the growth of circuit width and depth as a function of the number of objects and planning horizon, providing constructive proofs. We also illustrate the utility of this analysis for designing neural networks for policy learning.

artificial intelligence, machine learning, regression rule, (18 more...)

arXiv.org Machine Learning

2312.03682

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hot papers on arXiv from the past month: April 2021

AIHub | May-5-2021, 10:12:01 GMT

Here are the most tweeted papers that were uploaded onto arXiv during April 2021. Results are powered by Arxiv Sanity Preserver.

Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities
Michelle M. Li, Kexin Huang, Marinka Zitnik
Submitted to arXiv on: 11 April 2021
Abstract: With the remarkable success of representation learning in providing powerful predictions and data insights, we have witnessed a rapid expansion of representation learning techniques into modeling, analysis, and learning with networks. Biomedical networks are universal descriptors of systems of interacting elements, from protein interactions to disease networks, all the way to healthcare systems and scientific knowledge. In this review, we put forward an observation that long-standing principles of network biology and medicine -- while often unspoken in machine learning research -- can provide the conceptual grounding for representation learning, explain its current successes and limitations, and inform future advances.

accuracy, arxiv, submitted, (15 more...)

AIHub

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hitting the Books: The Brooksian revolution that led to rational robots

Engadget | Feb-27-2021, 16:30:56 GMT

We are living through an AI renaissance thought wholly unimaginable just a few decades ago -- automobiles are becoming increasingly autonomous, machine learning systems can craft prose nearly as well as human poets, and almost every smartphone on the market now comes equipped with an AI assistant. Oxford professor Michael Wooldridge has spent the past quarter century studying technology. In his new book, A Brief History of Artificial Intelligence, Wooldridge leads readers on an exciting tour of the history of AI, its present capabilities, and where the field is heading into the future. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher. In his 1962 book, The Structure of Scientific Revolutions, the philosopher Thomas Kuhn argued that, as scientific understanding advances, there will be times when established scientific orthodoxy can no longer hold up under the strain of manifest failures.

block world, brook, reasoning, (14 more...)

Engadget

Country: Oceania > Australia (0.05)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.55)
